feat: stateless watcher implementation #1
Open
imajus wants to merge 1 commit into
- Async REST client wrapping /add, /update, /delete, /cognify, and dataset listing with name lookup; retries on 5xx/429 with tenacity exponential backoff.
- Watcher loop using watchfiles awatch + per-burst debounce; collapses add->delete sequences and emits a single /cognify per dataset per batch.
- Reconciliation sweep on boot and on a configurable interval to repair drift from missed events.
- Path-to-name encoding bijection (replace _ with _U, then / with __).
- Env-var configuration via pydantic-settings; hard-fails on missing required variables.
- Structured JSON logs via python-json-logger.
- Dockerfile and CLI entrypoint (cognee-watcher / python -m cognee_watcher).
- Tests cover encoding round-trips, REST client via respx, and operation routing via an in-memory fake client.

27 tests, ruff clean. Not yet tested end-to-end against a real Cognee instance.
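For reviewers, the encoding rule in the commit message (replace `_` with `_U`, then `/` with `__`) can be sketched as below. This is a minimal illustration, not the code in this PR: the function names `encode_path` and `decode_name` are hypothetical, and the decoder is one possible way to invert the rule.

```python
def encode_path(path: str) -> str:
    """Encode a relative path as a flat name: '_' -> '_U', then '/' -> '__'."""
    # Order matters: escaping '_' first guarantees that every underscore in
    # the output starts either '_U' (literal underscore) or '__' (separator).
    return path.replace("_", "_U").replace("/", "__")


def decode_name(name: str) -> str:
    """Invert encode_path by reading underscore pairs from left to right."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch != "_":
            out.append(ch)
            i += 1
            continue
        pair = name[i:i + 2]
        if pair == "_U":
            out.append("_")
        elif pair == "__":
            out.append("/")
        else:
            # A lone '_' or '_x' can never be produced by encode_path.
            raise ValueError(f"not a valid encoded name: {name!r}")
        i += 2
    return "".join(out)
```

Because every underscore in an encoded name begins a two-character token, the left-to-right scan is unambiguous, which is what makes the mapping reversible.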
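The retry policy named in the commit message (exponential backoff on 5xx / 429) could look roughly like the sketch below. It deliberately swaps tenacity for a hand-rolled standard-library loop so the example runs on its own. The names `is_retryable` and `call_with_backoff`, the `send` callable returning `(status, body)`, the default delays, and the use of `ConnectionError` as a stand-in for transport errors are all assumptions for illustration, not the PR's actual client code.

```python
import asyncio


def is_retryable(status: int) -> bool:
    """Retry on 429 (rate limited) and on any 5xx server error."""
    return status == 429 or 500 <= status <= 599


async def call_with_backoff(send, *, attempts=5, base_delay=0.5,
                            max_delay=30.0, sleep=asyncio.sleep):
    """Await `send()` until it succeeds, fails permanently, or attempts run out.

    `send` is a hypothetical zero-argument coroutine function that returns
    a (status, body) tuple and may raise ConnectionError.
    """
    for attempt in range(attempts):
        last_try = attempt == attempts - 1
        try:
            status, body = await send()
        except ConnectionError:
            if last_try:
                raise  # out of attempts: surface the transport error
        else:
            if not is_retryable(status) or last_try:
                return status, body
        # Exponential backoff: base, 2*base, 4*base, ... capped at max_delay.
        await sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the backoff schedule testable without real waiting; a production version would likely also add jitter.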
First implementation pass of the cognee-watcher per DESIGN.md. Parked pending the Cognee deployment so we don't lose the work.
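As a rough illustration of the burst handling described for the watcher (collapse add->delete sequences, one cognify per dataset per batch), here is a minimal sketch. It assumes events arrive as `(kind, path)` tuples with `kind` in `"added"`, `"modified"`, `"deleted"`, and that the dataset is the first path component; both are assumptions made for this example, as are the function names `collapse_burst` and `datasets_to_cognify`.

```python
def collapse_burst(events):
    """Reduce one debounced burst of (kind, path) events to per-path ops."""
    first_last = {}
    for kind, path in events:
        # Remember the first and the most recent event kind for each path.
        first, _ = first_last.get(path, (kind, kind))
        first_last[path] = (first, kind)

    ops = []
    for path, (first, last) in first_last.items():
        if first == "added" and last == "deleted":
            continue  # created and removed within the burst: nothing to sync
        if last == "deleted":
            ops.append(("delete", path))
        elif first == "added":
            ops.append(("add", path))
        else:
            ops.append(("update", path))
    return ops


def datasets_to_cognify(ops):
    """Return each touched dataset once, in first-seen order.

    Assumption: the dataset name is the first component of the path.
    """
    seen = []
    for _, path in ops:
        dataset = path.split("/", 1)[0]
        if dataset not in seen:
            seen.append(dataset)
    return seen
```

With this shape, the caller would issue the add / update / delete requests from `collapse_burst` and then trigger cognify once for each name returned by `datasets_to_cognify`.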
Scope
- `src/cognee_watcher/` modules: `config` (pydantic-settings env validation), `log` (JSON structured logs), `encoding` (path↔name bijection: `_` → `_U`, `/` → `__`), `client` (async REST wrapper with tenacity retries on 5xx / 429 / transport errors), `watcher` (awatch loop, debounce, op routing, single `cognify` per burst), `reconcile` (boot + periodic drift-repair sweep), `main` (CLI entry, SIGTERM handling, wiring).
- `tests/`: 27 tests via pytest + respx (HTTP mocking) + in-memory fake client. Encoding round-trips, REST client error paths, watcher routing (add / update / delete / batch / ignore-globs / collapse).
- `pyproject.toml` (hatchling, deps: httpx, watchfiles, tenacity, pydantic-settings, python-json-logger), `Dockerfile`, `.env.example`, `README.md` quickstart.

Verified locally
- `pytest -q` → 27 passed
- `ruff check .` → clean
- `cognee-watcher` boots and hard-fails with a clear error on missing required env

Not yet verified (blocks merge)
- `GET /api/v1/datasets/{ds}/data?name=…` filters server-side (client falls back to scanning anyway)
- `PATCH /api/v1/update` accepts `dataId` + `datasetId` as multipart form fields alongside the file
- Response shape of `/add` / `/update` / `/data` (current code tolerates both a raw list and a `{data: […]}` envelope)

Done when